Using informative behavior to increase engagement while learning from human reward

Related resources

Using informative behavior to increase engagement in the TAMER framework

In this paper, we address a relatively unexplored aspect of designing agents that learn from human training by investigating how the agent’s non-task behavior can elicit human feedback of higher quality and quantity. We use the TAMER framework, which facilitates the training of agents by human-generated reward signals, i.e., judgements of the quality of the agent’s actions, as the foundation fo...

Reinforcement Learning from Demonstration and Human Reward

In this paper, we proposed a model-based method, IRL-TAMER, for combining learning from demonstration via inverse reinforcement learning (IRL) and learning from human reward via the TAMER framework. We tested our method in the Grid World domain and compared it with the TAMER framework using different discount factors on human reward. Our results suggest that with one demonstration, although an agen...

Separating Skills from Preference: Using Learning to Program by Reward

Developers of artificial agents commonly take the view that we can only specify agent behavior via the expensive process of implementing new skills. This paper offers an alternative expressed by the separation hypothesis: that the behavioral differences among individuals are due to the action of distinct preferences over the same set of skills. We test this hypothesis in a simulated automotive ...

Using teacher greetings to increase speed to task engagement.

We used a multiple baseline design across participants to determine if teacher greetings would reduce the latency to task engagement. Three participants were identified by their respective teachers as having difficulty initiating task-appropriate engagement at the beginning of class. Latency was measured from teacher greeting until the participant was actively engaged for 5 consecutive seconds....

Asymptotic Behavior of Multivariate Reward Processes with Nonlinear Reward Functions

Journal

Journal title: Autonomous Agents and Multi-Agent Systems

Year: 2015

ISSN: 1387-2532,1573-7454

DOI: 10.1007/s10458-015-9308-2